[CUDA] Fix cuda provider fallback inconsistency #21425

Merged: 2 commits into main from tlwu/fix_cuda_fallback on Jul 23, 2024
Conversation

@tianleiwu (Contributor) commented Jul 20, 2024

Description

  • Fix the fallback setting: previously, CUDA could fall back to CUDA itself (a minimal sketch of the intended logic follows this list).
  • Fix the CUDA provider fallback being inconsistent depending on whether the CUDA_PATH environment variable is set.
  • Add the required CUDA and cuDNN major versions to the error message.
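The first fix amounts to excluding the failed provider from the retry list. A minimal sketch of that idea, using hypothetical names (illustrative only, not the actual onnxruntime internals):

def build_fallback_providers(requested, failed):
    # Retry with the remaining providers only; never retry the one that
    # just failed (previously CUDA could "fall back" to CUDA itself).
    return [p for p in requested if p != failed]

requested = ["CUDAExecutionProvider", "CPUExecutionProvider"]
print(build_fallback_providers(requested, "CUDAExecutionProvider"))
# ['CPUExecutionProvider']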

Example result on Windows:

>>> import onnxruntime
>>> ort_session = onnxruntime.InferenceSession("model.onnx", providers=['CUDAExecutionProvider', 'CPUExecutionProvider'])
2024-07-19 17:43:44.2260019 [E:onnxruntime:Default, provider_bridge_ort.cc:1972 onnxruntime::TryGetProviderInfo_CUDA] D:\onnxruntime\onnxruntime\core\session\provider_bridge_ort.cc:1636 onnxruntime::ProviderLibrary::Get [ONNXRuntimeError] : 1 : FAIL : LoadLibrary failed with error 126 "" when trying to load "C:\Users\.conda\envs\py310\lib\site-packages\onnxruntime\capi\onnxruntime_providers_cuda.dll"

2024-07-19 17:43:44.2312351 [W:onnxruntime:Default, onnxruntime_pybind_state.cc:970 onnxruntime::python::CreateExecutionProviderInstance] Failed to create CUDAExecutionProvider. Require cuDNN 9.* and CUDA 12.*, and the latest MSVC runtime. Please install all dependencies as mentioned in the GPU requirements page (https://onnxruntime.ai/docs/execution-providers/CUDA-ExecutionProvider.html#requirements), make sure they're in the PATH, and that your GPU is supported.
>>> ort_session
<onnxruntime.capi.onnxruntime_inference_collection.InferenceSession object at 0x0000016BB2DF7D60>
>>> ort_session.get_providers()
['CPUExecutionProvider']

Example result on Linux:

>>> import onnxruntime
>>> ort_session = onnxruntime.InferenceSession("resnet50-v2-7.onnx", providers=['CUDAExecutionProvider', 'CPUExecutionProvider'])
2024-07-20 20:33:26.486974543 [E:onnxruntime:Default, provider_bridge_ort.cc:1972 TryGetProviderInfo_CUDA] /work/onnxruntime/onnxruntime/core/session/provider_bridge_ort.cc:1636 onnxruntime::Provider& onnxruntime::ProviderLibrary::Get() [ONNXRuntimeError] : 1 : FAIL : Failed to load library libonnxruntime_providers_cuda.so with error: libcublasLt.so.12: cannot open shared object file: No such file or directory

2024-07-20 20:33:26.487034646 [W:onnxruntime:Default, onnxruntime_pybind_state.cc:961 CreateExecutionProviderInstance] Failed to create CUDAExecutionProvider. Require cuDNN 9.* and CUDA 12.*. Please install all dependencies as mentioned in the GPU requirements page (https://onnxruntime.ai/docs/execution-providers/CUDA-ExecutionProvider.html#requirements), make sure they're in the PATH, and that your GPU is supported.
>>> ort_session.get_providers()
['CPUExecutionProvider']
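
As both transcripts show, the session silently falls back to CPU when the CUDA provider cannot be loaded. A caller that requires the GPU can detect this after construction with get_providers(). A hedged usage sketch ("model.onnx" is a placeholder path):

import onnxruntime

session = onnxruntime.InferenceSession(
    "model.onnx",
    providers=["CUDAExecutionProvider", "CPUExecutionProvider"],
)
# get_providers() lists the providers actually in use; if CUDA failed to
# load, only CPUExecutionProvider remains, so fail fast instead of
# silently running on CPU.
if "CUDAExecutionProvider" not in session.get_providers():
    raise RuntimeError(
        "CUDAExecutionProvider unavailable; install CUDA 12.x and cuDNN 9.x "
        "as described in the GPU requirements page."
    )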

Motivation and Context

#21424

jywu-msft previously approved these changes Jul 20, 2024
@tianleiwu merged commit 2b7e2a5 into main Jul 23, 2024
95 of 98 checks passed
@tianleiwu deleted the tlwu/fix_cuda_fallback branch Jul 23, 2024 18:58